CLUSTERED RADIO INTERFEROMETRIC CALIBRATION 
ABSTRACT

This paper introduces an amendment to radio interferometric
calibration of sources below the noise level. The main idea is
to employ the information of the stronger sources' measured
signals as a plug-in criterion to solve for the weaker ones. For
this purpose, we construct a number of source clusters, with
centroids mainly near the strongest sources, assuming that the
signals of the sources belonging to a single cluster are
corrupted by almost the same errors. Due to this characteristic
of clusters, each cluster is calibrated as a single source,
using all the coherencies of its sources simultaneously. The
obtained solutions for every cluster are assigned to all the
cluster's sources. An illustrative example reveals the
superiority of this calibration compared to the un-clustered
calibration.

Index Terms - Calibration, Clustering methods, Clustering
algorithms, Interferometry: Radio interferometry

1. INTRODUCTION 

Calibration of radio synthesis arrays refers to the estimation
and reduction of errors introduced by the atmosphere and the
incorporated instruments, before imaging. It is the most
crucial task in order to achieve the interferometer's desired
precision and sensitivity. Early radio astronomy used external
(classical) calibration, which is based on estimating the
unknown instrument parameters using a celestial radio source
with known properties. External calibration was then improved
by self-calibration [1], which utilizes only the observed data
for estimating both the source and instrumental unknowns.

Although the calibration of radio telescopes highly benefits
from various self-calibration techniques, its performance
in interferometric source subtraction is still limited to sources
that have a high enough Signal to Noise Ratio (SNR) to be
distinguished from the background noise [2], [3], [4]. The
novelty of this paper is that the presented method has a high
performance in source calibration below the noise level,
utilizing the strongest sources' signals. The implementation of
such a calibration, termed "clustered calibration", is
performed by clustering the sources during the calibration
process. The clustered calibration improves the information used
for calculating solutions by incorporating the total of signals
observed at each cluster instead of each individual source's
signal. Thus, in the case of calibrating the low signals of very
weak sources, it provides a considerably better result
compared with the un-clustered calibration.

Clustering, from the data mining point of view, can be
defined as the gathering of similar data points together into
groups. An overview of different clustering methods is given
by [5]. We intend to cluster radio sources in such a way that
the visibilities of sources belonging to a single cluster are
affected by almost the same errors, and subsequently can share
the same calibration solutions. This assumption is valid when
the clusters' angular diameters are small enough so that the
variation in their actual solutions is negligible. After
arranging the clusters' centroids mainly near the strongest
sources, we calibrate considering every cluster as a single
source, and assign the obtained solutions to all the cluster's
sources.

We present the data model of clustered calibration and
apply two clustering methods: (i) weighted K-means [6], [7]
and (ii) divisive hierarchical clustering [8], to cluster the
sky's sources. In an illustrative example, we demonstrate
the superiority of the clustered calibration compared to the
un-clustered calibration, in solving for the sources below
the noise level, using data observed by the LOw Frequency
ARray (LOFAR) synthesis radio telescope^1.

The following notations are used in this paper: Bold,
lower case letters refer to column vectors, e.g., $\mathbf{y}$.
Upper case bold letters refer to matrices, e.g., $\mathbf{C}$.
All parameters are complex numbers, unless stated otherwise.
The transpose and conjugation of a matrix are denoted by
$(.)^T$ and $(.)^*$, respectively. The matrix Kronecker
product, the set membership, the empty set, and the union
operator are denoted by $\otimes$, $\in$, $\emptyset$, and
$\cup$, respectively.

2. CLUSTERED CALIBRATION DATA MODEL 

In this section, we briefly describe the data model of the radio
interferometric clustered calibration. For more details on
radio interferometry the reader is referred to [9] and for the
data model of radio interferometric calibration to [10], [11],
[12].

^1 http://www.lofar.org

Consider an interferometric array consisting of N receivers,
each with two orthogonal polarized feeds X and Y. The induced
voltages at the p-th receiver's feeds,
$\mathbf{v}_{pi} = [v_{Xpi}\ v_{Ypi}]^T$, due to the polarized
waves radiated by the i-th source,
$\mathbf{e}_i = [e_{Xi}\ e_{Yi}]^T$, are given by

$$\mathbf{v}_{pi} = \mathbf{J}_{pi} \mathbf{e}_i. \tag{1}$$



In Eq. (1), $\mathbf{J}_{pi}$ represents the 2x2 Jones matrix
[10], corresponding to corruptions of the signal at receiver p,
which is a product of different Jones matrices as [12]

$$\mathbf{J}_{pi} = \mathbf{G}_p \mathbf{E}_{pi} \mathbf{Z}_{pi} \mathbf{F}_{pi} \mathbf{K}_{pi}. \tag{2}$$



In Eq. (2), $\mathbf{K}_{pi}$, $\mathbf{E}_{pi}$,
$\mathbf{Z}_{pi}$, and $\mathbf{F}_{pi}$ are the Fourier
transform, antenna's voltage pattern, ionospheric phase
fluctuation, and Faraday Rotation matrices corresponding to the
i-th source's direction and receiver p's location on the earth,
respectively, and $\mathbf{G}_p$ is the receiver p's clock phase
and electronic gain.

We assume that the total of signals seen at each receiver
is a superposition of the K sources' corrupted signals, plus
the receiver's thermal noise. Note that the multitude of the
ignored fainter sources also contributes to the noise. After
correcting the geometric delays corresponding to the receivers'
locations on the earth, we correlate the collected signals at
every pair of receivers to obtain visibilities [10]. Since the
complex gain Jones matrix $\mathbf{G}$ does not depend on the
source direction, it is initially calculated at every receiver
and then the visibilities are corrected for it. Stacking up all
the corrected visibilities in vector $\mathbf{y}$, we arrive at
the general data model

$$\mathbf{y} = \sum_{i=1}^{K} \mathbf{s}_i + \mathbf{n}. \tag{3}$$

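In Eq. (3), $\mathbf{n}$ is the additive noise vector, normally
assumed to be Gaussian white noise. The nonlinear function
$\mathbf{s}_i$ shows the contribution of the i-th source in the
observation:

$$\mathbf{s}_i = \begin{bmatrix} \mathbf{J}_{2i}^* \otimes \mathbf{J}_{1i}\, \mathrm{vec}(\mathbf{C}_{\{12\}i}) \\ \vdots \\ \mathbf{J}_{Ni}^* \otimes \mathbf{J}_{(N-1)i}\, \mathrm{vec}(\mathbf{C}_{\{(N-1)N\}i}) \end{bmatrix}, \tag{4}$$

where

$$\mathbf{J}_{pi} = \mathbf{E}_{pi} \mathbf{Z}_{pi} \mathbf{F}_{pi}, \qquad \mathbf{C}_{\{pq\}i} = \mathbf{K}_{\{pq\}i} \mathbf{C}_i,$$

$\mathbf{C}_i$ is the source's coherency matrix [15], [10], and
the scalar Jones matrix $\mathbf{K}_{\{pq\}i}$ corresponds to
the Fourier transform between the source's direction and the
baseline pq.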
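
As a minimal numerical sketch (not part of the original derivation), each row of Eq. (4) can be checked against the direct matrix product J_p C J_q^H for a single baseline pq. The helper names `random_complex` and `vec`, the random test matrices, and the choice of column-stacking vectorization are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_complex(shape):
    """Random complex array standing in for a Jones or coherency matrix."""
    return rng.normal(size=shape) + 1j * rng.normal(size=shape)

def vec(m):
    """Column-stacking vectorization (assumed convention)."""
    return m.reshape(-1, order="F")

J_p = random_complex((2, 2))   # Jones matrix of receiver p
J_q = random_complex((2, 2))   # Jones matrix of receiver q
C = random_complex((2, 2))     # coherency seen on baseline pq

# Direct form: corrupted visibility of one source on baseline pq.
direct = vec(J_p @ C @ J_q.conj().T)

# Kronecker form, using vec(A X B) = (B^T kron A) vec(X) with B = J_q^H.
via_kron = np.kron(J_q.conj(), J_p) @ vec(C)

identity_holds = np.allclose(direct, via_kron)
```

Under these assumptions `identity_holds` evaluates to true, which is the sense in which the stacked Kronecker products describe the per-baseline visibilities.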

The calibration is essentially an estimation of the
$\mathbf{J}$ Jones matrix, and the removal of the K brightest
sources. However, in practice, the $\mathbf{E}$, $\mathbf{Z}$,
and $\mathbf{F}$ Jones matrices obtained for nearby directions
and for a given receiver are almost the same. Thus, for every
receiver p, if the i-th and j-th sources have a small angular
separation from each other we have

$$\mathbf{J}_{pi} \approx \mathbf{J}_{pj}. \tag{5}$$

Eq. (5) is the underlying assumption for clustered calibration
and it tells us that the Fourier transform $\mathbf{K}$ Jones
matrix is the only Jones matrix which should be calculated
individually for all directions.

Assume that we have Q ($Q \ll K$) source clusters $L_i$, for
$i \in \{1, \ldots, Q\}$, with small enough angular diameters.
Based on Eq. (5), for every cluster $L_i$, we define

$$\tilde{\mathbf{C}}_{\{pq\}i} = \sum_{l \in L_i} \mathbf{C}_{\{pq\}l}, \tag{6}$$



and by substituting this new definition in Eq. (3), we
formulate the clustered calibration data model, where the index
i runs over the clusters and not over separate sources. Various
techniques can be used to solve this non-linear data model. One
of the more popular is the Least Squares (LS) optimization
algorithm, which is discussed along with other methods in [13],
[14].
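
The following sketch illustrates, under stated assumptions, why a cluster can be treated as a single source: if all sources of a cluster share the same Jones matrices, as assumed in Eq. (5), the sum of their corrupted visibilities equals the corruption applied once to the summed coherencies. The random matrices, the number of sources, and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

def random_complex(shape):
    """Random complex array used as a stand-in for real data."""
    return rng.normal(size=shape) + 1j * rng.normal(size=shape)

n_sources = 5                               # sources in one cluster (assumed)
J_p = random_complex((2, 2))                # shared Jones matrix, receiver p
J_q = random_complex((2, 2))                # shared Jones matrix, receiver q
C_src = random_complex((n_sources, 2, 2))   # per-source coherencies on baseline pq

# Un-clustered view: corrupt every source separately, then add up.
per_source = sum(J_p @ C @ J_q.conj().T for C in C_src)

# Clustered view: sum the coherencies first, then corrupt once.
C_cluster = C_src.sum(axis=0)
clustered = J_p @ C_cluster @ J_q.conj().T

same = np.allclose(per_source, clustered)
```

The equality is exact here only because the Jones matrices were made identical; with real data Eq. (5) holds approximately, so the clustered model is an approximation whose quality depends on the cluster's angular diameter.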

3. CLUSTERING OF RADIO SOURCES 

Suppose that the K sources, $x_1, \ldots, x_K$, have equatorial
coordinates (Right Ascension $\alpha$, Declination $\delta$)
equal to $(\alpha_1, \delta_1), \ldots, (\alpha_K, \delta_K)$.
The aim is to find the optimum Q clusters so that the objective
function $f = \sum_{q=1}^{Q} D(L_q)$ is minimized. $D(L_q)$ is
the angular diameter of cluster $L_q$, for
$q \in \{1, \ldots, Q\}$, defined as

$$D(L_q) = \max\{ d(x_i, x_j) \mid x_i, x_j \in L_q \}, \tag{7}$$

and $d(., .)$ is the angular separation between any two points
on the celestial sphere. Having two radio sources a and b with
equatorial coordinates $(\alpha_a, \delta_a)$ and
$(\alpha_b, \delta_b)$, respectively, the angular separation
$d(a, b)$, in radians, is obtained by

$$d(a, b) = \tan^{-1} \frac{\sqrt{\cos^2\delta_b \sin^2\Delta\alpha + [\cos\delta_a \sin\delta_b - \sin\delta_a \cos\delta_b \cos\Delta\alpha]^2}}{\sin\delta_a \sin\delta_b + \cos\delta_a \cos\delta_b \cos\Delta\alpha}, \tag{8}$$

where $\Delta\alpha = \alpha_b - \alpha_a$.
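
A minimal sketch of Eqs. (7) and (8) is given below. The function names `angular_separation` and `cluster_diameter` are hypothetical, coordinates are assumed to be (Right Ascension, Declination) pairs in radians, and `atan2` is used in place of the plain inverse tangent so that separations larger than 90 degrees are also handled; that substitution is an implementation choice, not something stated in the text.

```python
import math
from itertools import combinations

def angular_separation(a, b):
    """Angular separation in radians between a and b, following Eq. (8).

    a and b are (alpha, delta) tuples in radians.
    """
    alpha_a, delta_a = a
    alpha_b, delta_b = b
    d_alpha = alpha_b - alpha_a
    numerator = math.hypot(
        math.cos(delta_b) * math.sin(d_alpha),
        math.cos(delta_a) * math.sin(delta_b)
        - math.sin(delta_a) * math.cos(delta_b) * math.cos(d_alpha),
    )
    denominator = (
        math.sin(delta_a) * math.sin(delta_b)
        + math.cos(delta_a) * math.cos(delta_b) * math.cos(d_alpha)
    )
    # atan2 keeps the result in [0, pi] even when the denominator is negative.
    return math.atan2(numerator, denominator)

def cluster_diameter(sources):
    """Largest pairwise separation within a cluster, following Eq. (7)."""
    return max(
        (angular_separation(a, b) for a, b in combinations(sources, 2)),
        default=0.0,  # a cluster with fewer than two sources has zero diameter
    )
```

For instance, two points on the celestial equator a quarter turn apart in Right Ascension are separated by pi/2 radians, and a cluster made of those two points has that same diameter.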

To get the most information from the strongest observed
signals in calibration, the centroids of the clusters should
lean towards the brightest sources. Therefore, for defining
the centroids, we associate a weight to the i-th source,
$i \in \{1, \ldots, K\}$, as

$$w_i = w(x_i) = \frac{I_i}{I^*}, \tag{9}$$

where $I_i$ is the source's intensity and
$I^* = \min\{I_1, \ldots, I_K\}$.

We cluster radio sources using weighted K-means and divisive
hierarchical clustering algorithms. Since the source
clustering for calibration is performed offline, its
computational complexity is negligible compared with the
calibration procedure itself. Both of the clustering methods
are hard clustering techniques, which divide data into distinct
clusters. However, we expect more accurate results using fuzzy
(soft) clustering, which constructs overlapping clusters with
uncertain boundaries. Application and performance of this type
of clustering will be explored in future work.



3.1. Weighted K-means clustering algorithm 

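Step1. Select the Q brightest sources, $x_{1^*}, \ldots, x_{Q^*}$,
and initialize the centroids of Q clusters by their locations as

$$c_q = [\alpha_{q^*}\ \delta_{q^*}], \quad \text{for } q \in \{1, \ldots, Q\},\ q^* \in \{1^*, \ldots, Q^*\}. \tag{10}$$

Step2. Assign each source to the cluster with the closest
centroid, defining the membership function

$$m_{L_q}(x_i) = \begin{cases} 1, & \text{if } d(x_i, c_q) = \min\{ d(x_i, c_j) \mid j = 1, \ldots, Q \}, \\ 0, & \text{otherwise.} \end{cases}$$

Step3. Update the centroids by

$$c_q = \frac{\sum_{i=1}^{K} m_{L_q}(x_i)\, w_i\, x_i}{\sum_{i=1}^{K} m_{L_q}(x_i)\, w_i}, \quad \text{for } q \in \{1, \ldots, Q\}. \tag{11}$$

Repeat steps 2 and 3 until there are no reassignments of
sources to clusters.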
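
The three steps above can be sketched as follows. This is an illustrative outline rather than the authors' implementation: the function names `separation` and `weighted_kmeans`, the `max_iter` safeguard, and the handling of empty clusters are assumptions. The centroid update averages the (Right Ascension, Declination) pairs directly with intensity-based weights, which presumes a field small enough that Right Ascension wrap-around does not occur.

```python
import numpy as np

def separation(a, b):
    """Angular separation (radians) between arrays of (alpha, delta) pairs."""
    alpha_a, delta_a = a[..., 0], a[..., 1]
    alpha_b, delta_b = b[..., 0], b[..., 1]
    d_alpha = alpha_b - alpha_a
    num = np.hypot(
        np.cos(delta_b) * np.sin(d_alpha),
        np.cos(delta_a) * np.sin(delta_b)
        - np.sin(delta_a) * np.cos(delta_b) * np.cos(d_alpha),
    )
    den = (np.sin(delta_a) * np.sin(delta_b)
           + np.cos(delta_a) * np.cos(delta_b) * np.cos(d_alpha))
    return np.arctan2(num, den)

def weighted_kmeans(coords, intensity, n_clusters, max_iter=100):
    """Cluster sources; returns (labels, centroids)."""
    coords = np.asarray(coords, dtype=float)        # shape (K, 2)
    intensity = np.asarray(intensity, dtype=float)  # shape (K,)
    weights = intensity / intensity.min()           # weight of each source

    # Step 1: centroids start at the n_clusters brightest sources.
    brightest = np.argsort(intensity)[::-1][:n_clusters]
    centroids = coords[brightest].copy()

    labels = np.full(len(coords), -1)
    for _ in range(max_iter):  # max_iter is a safeguard, not part of the text
        # Step 2: assign every source to its closest centroid.
        dist = separation(coords[:, None, :], centroids[None, :, :])
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # no reassignments: converged
        labels = new_labels
        # Step 3: weighted mean of the members' coordinates.
        for q in range(n_clusters):
            members = labels == q
            if members.any():  # an empty cluster keeps its old centroid
                centroids[q] = np.average(
                    coords[members], axis=0, weights=weights[members]
                )
    return labels, centroids
```

Because the weights grow with intensity, each centroid stays close to the brightest member of its cluster, which is the behaviour the weighting is meant to produce.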

3.2. Divisive hierarchical clustering algorithm 

Step1. Initialize the cluster counter Q' to 1, assign all the K
sources to a single cluster $L_1$, and initialize the set of
null clusters A as empty.

Step2. Choose cluster $L_{q^*}$, for
$q^* \in \{1, \ldots, Q'\} - A$, with the largest angular
diameter

$$D(L_{q^*}) = \max\{ D(L_q) \mid q \in \{1, \ldots, Q'\} - A \}. \tag{12}$$

Step3. Apply the presented weighted K-means clustering
technique to split $L_{q^*}$ into two clusters, $L'_{q^*}$ and
$L''_{q^*}$.

Step4. If $D(L'_{q^*}) + D(L''_{q^*}) < D(L_{q^*})$, then set
$Q' = Q' + 1$, $L_{q^*} = L'_{q^*}$, $L_{Q'} = L''_{q^*}$, and
$A = \emptyset$; otherwise set $A = A \cup \{q^*\}$.

Repeat steps 2, 3, and 4 until Q' = Q.
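
A compact sketch of the four steps is shown below, again as an illustration under assumptions rather than the authors' code. The names `separation`, `diameter`, `split_in_two`, and `divisive_clustering` are hypothetical. Two guards are added that the text does not mention: clusters with a single source are never chosen for splitting, and the loop stops early if no splittable cluster remains.

```python
import numpy as np

def separation(a, b):
    """Angular separation (radians) between arrays of (alpha, delta) pairs."""
    d_alpha = b[..., 0] - a[..., 0]
    da, db = a[..., 1], b[..., 1]
    num = np.hypot(np.cos(db) * np.sin(d_alpha),
                   np.cos(da) * np.sin(db)
                   - np.sin(da) * np.cos(db) * np.cos(d_alpha))
    den = np.sin(da) * np.sin(db) + np.cos(da) * np.cos(db) * np.cos(d_alpha)
    return np.arctan2(num, den)

def diameter(coords):
    """Largest pairwise separation inside one cluster."""
    if len(coords) < 2:
        return 0.0
    return float(separation(coords[:, None, :], coords[None, :, :]).max())

def split_in_two(coords, intensity, max_iter=100):
    """Weighted K-means with two clusters; returns a mask of the first one."""
    weights = intensity / intensity.min()
    centroids = coords[np.argsort(intensity)[::-1][:2]].copy()
    labels = np.full(len(coords), -1)
    for _ in range(max_iter):
        dist = separation(coords[:, None, :], centroids[None, :, :])
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
        for q in (0, 1):
            m = labels == q
            if m.any():
                centroids[q] = np.average(coords[m], axis=0, weights=weights[m])
    return labels == 0

def divisive_clustering(coords, intensity, n_clusters):
    """Returns a list of index arrays, one per cluster."""
    coords = np.asarray(coords, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    clusters = [np.arange(len(coords))]   # Step 1: a single cluster
    null = set()                          # Step 1: the set A starts empty
    while len(clusters) < n_clusters:
        candidates = [q for q in range(len(clusters))
                      if q not in null and len(clusters[q]) > 1]
        if not candidates:
            break  # added guard: nothing left that can be split
        # Step 2: candidate cluster with the largest angular diameter.
        q_star = max(candidates, key=lambda q: diameter(coords[clusters[q]]))
        idx = clusters[q_star]
        # Step 3: split it in two with weighted K-means.
        first = split_in_two(coords[idx], intensity[idx])
        part1, part2 = idx[first], idx[~first]
        # Step 4: accept the split only if the summed diameters decrease.
        if (len(part1) and len(part2)
                and diameter(coords[part1]) + diameter(coords[part2])
                < diameter(coords[idx])):
            clusters[q_star] = part1
            clusters.append(part2)
            null = set()
        else:
            null.add(q_star)
    return clusters
```

Resetting the null set after every accepted split mirrors Step 4: a cluster that could not be split earlier becomes a candidate again once the partition has changed.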

4. ILLUSTRATIVE EXAMPLE 

We consider the calibration of data obtained by LOFAR using
25 stations (receivers). The observation is centered at the
radio source 3C196 and has an integration time of 6 hours.
The central source as well as the four brightest sources were
initially subtracted and the result is shown in Fig. 1. We
subsequently processed the same data using the classical
calibration, in the direction of the 8 bright sources, and the
aforementioned clustered calibration, with 10 clusters produced
by weighted K-means and divisive hierarchical clustering
methods, on 30-second time intervals. For all the methods the
LS optimization is used with 9 iterations and the residual
images, zoomed into the area enclosed by the white window in
Fig. 1, are shown in Fig. 2. The inset figures focusing on one
of those 8 sources, randomly chosen, show that the source has
been considerably better subtracted in the case of clustered
calibration, while in the case of classical calibration there
is a significant residual error remaining. Table 1 presents the
comparison of the Root Mean Squared (RMS) of the residual maps
for different regions and the full images produced by the
classical and clustered calibration methods. The clustered
calibration method has a lower RMS in all the cases.

Fig. 1. Ten averaged channels synthesis image of a 6 hour
long LOFAR 3C196 observation. The central source (3C196,
peak flux is 70 Jy) plus the four brightest sources have been
removed. Approximately 69 sources can be seen after the
subtraction. The noise level is 6 mJy.

Table 1. The RMS of the residual images for the LS calibration
without clustering (classical calibration), and by using
Hierarchical (HC) as well as Weighted K-means Clustering
(WKC) of the sources. The letters A, B, C and D correspond
to the regions demarcated by the boxes in Fig. 1.
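
For reference, an RMS figure of the kind compared above could be computed along the following lines. This is only a sketch: the function name `region_rms`, the pixel-index box convention, and the choice to take the RMS about zero rather than about the mean are assumptions, since the text does not specify how the values were obtained.

```python
import numpy as np

def region_rms(image, box=None):
    """RMS of pixel values in a residual image or in a rectangular region.

    box is an optional (row_start, row_stop, col_start, col_stop) tuple of
    pixel indices; when omitted, the full image is used.
    """
    image = np.asarray(image, dtype=float)
    if box is not None:
        r0, r1, c0, c1 = box
        image = image[r0:r1, c0:c1]
    # RMS about zero (assumed), appropriate for a residual map.
    return float(np.sqrt(np.mean(image ** 2)))
```

Calling `region_rms(residual)` would give the full-image value, and `region_rms(residual, box)` the value for one boxed region, where `residual` and `box` are placeholders for real data.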

5. CONCLUSIONS 

We have introduced a clustered calibration scheme for
calibrating radio interferometric data towards the sensitivity
limit. The method upgrades the coherencies of individual
sources by their total amount obtained at each source cluster.
Then, it applies calibration to these new coherencies that
carry a higher level of information compared with the initial
ones. Therefore, for calibration of sources below the noise
level it has a considerably better performance compared with
un-clustered calibration techniques. Divisive Hierarchical as
well as Weighted K-means clustering methods are used to
exploit the spatial proximity of the sources. It is also shown
by an illustrative example that the RMS at different regions
of the clustered calibration's residual images is consistently
lower, when compared to the un-clustered calibration, which
reveals its superiority at a low SNR. Hierarchical clustering
provides a marginally better result since it constructs clusters
of smaller angular diameters and thus it assigns the same
calibration solutions to sources that have smaller angular
separations. Future work will address the estimation of the
optimum number of clusters, the performance of fuzzy
clustering in the clustered calibration, and the combination of
clustering with different calibration methods.

Fig. 2. Zoomed in images obtained from the white window
in Fig. 1. The top row images are the initial image (left) and
the residual image after subtracting 8 sources by the classical
LS calibration (right). The residual images of the clustered
calibration using Hierarchical (left) and Weighted K-means
(right) clustering methods with 10 source clusters are shown
at the bottom row. The inset figures show a blow-up of one of
the 8 sources solved by the LS calibration.